142 research outputs found

    Real-Time RGB-D based Template Matching Pedestrian Detection

    Pedestrian detection is one of the most popular topics in computer vision and robotics. To address the challenges of detecting multiple pedestrians, we present a real-time, depth-based template-matching people detector. We propose several approaches for training the depth-based template. We train multiple templates to handle the various upper-body orientations of pedestrians and the different levels of detail in their depth maps at various distances from the camera. We also account for the differing reliability of regions within the sliding window by proposing a weighted template approach. Furthermore, we combine the depth detector with an appearance-based detector that acts as a verifier, taking advantage of appearance cues to compensate for the limitations of depth data. We evaluate our method on the challenging ETH dataset sequence and show that it outperforms state-of-the-art approaches. Comment: published in ICRA 201
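    The weighted sliding-window idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weighted mean absolute difference score, and the toy data are all assumptions chosen for illustration, and the abstract does not specify how the template or weights are actually trained.

```python
def weighted_template_distance(window, template, weights):
    """Weighted mean absolute difference between a depth window and a template.

    All three arguments are equally sized 2D lists. The weights stand in for
    the per-region reliability mentioned in the abstract (hypothetical form).
    """
    total = 0.0
    weight_sum = 0.0
    for row_w, row_t, row_k in zip(window, template, weights):
        for depth, expected, weight in zip(row_w, row_t, row_k):
            total += weight * abs(depth - expected)
            weight_sum += weight
    # With no reliable region at all, treat the window as a non-match.
    return total / weight_sum if weight_sum > 0 else float("inf")


def best_match(depth_map, template, weights):
    """Slide the template over the depth map; return (score, row, col) of the lowest-scoring window."""
    th, tw = len(template), len(template[0])
    best = (float("inf"), -1, -1)
    for r in range(len(depth_map) - th + 1):
        for c in range(len(depth_map[0]) - tw + 1):
            window = [row[c:c + tw] for row in depth_map[r:r + th]]
            score = weighted_template_distance(window, template, weights)
            if score < best[0]:
                best = (score, r, c)
    return best


# Toy example: the template is embedded at row 1, column 1 of the depth map.
toy_depth_map = [
    [5, 5, 5, 5],
    [5, 1, 2, 5],
    [5, 3, 4, 5],
]
toy_template = [[1, 2], [3, 4]]
uniform_weights = [[1, 1], [1, 1]]
print(best_match(toy_depth_map, toy_template, uniform_weights))
```

    Setting a weight to zero removes that cell from the score, which is one simple way to discount regions of the window assumed to be unreliable.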

    Exploring Subtasks of Scene Understanding: Challenges and Cross-Modal Analysis

    Scene understanding is one of the most important problems in computer vision. It consists of many subtasks, such as image classification (describing an image with one word), object detection (finding and localizing objects of interest in the image and assigning a category to each of them), semantic segmentation (assigning a category to each pixel of an image), instance segmentation (finding and localizing objects of interest and marking all the pixels belonging to each object), and depth estimation (estimating the distance of each pixel in the image from the camera). Each of these tasks has its advantages and limitations. They share a common goal: to understand and describe a scene captured in an image or a set of images. A common question is whether there is any synergy between these tasks. Therefore, alongside single-task approaches, there is a line of research on how to learn multiple tasks jointly. In this thesis, we explore different subtasks of scene understanding and propose mainly deep learning-based approaches to improve these tasks. First, we propose a modular Convolutional Neural Network (CNN) architecture for jointly training semantic segmentation and depth estimation tasks. We provide a setup suitable for analyzing the cross-modality influence between these tasks for different architecture designs. Then, we utilize object detection and instance segmentation as auxiliary tasks for focusing on target objects in the complex tasks of scene flow estimation and object 6D pose estimation. Furthermore, we propose a novel deep approach for object co-segmentation, which is the task of segmenting common objects in a set of images. Finally, we introduce a novel pooling layer that preserves spatial information while capturing a large receptive field. This pooling layer is designed to improve dense prediction tasks such as semantic segmentation and depth estimation.

    Transmetteurs photoniques sur silicium pour la prochaine génération de réseaux optiques (Silicon photonic transmitters for the next generation of optical networks)

    Analyzing Modular CNN Architectures for Joint Depth Prediction and Semantic Segmentation

    This paper addresses the task of designing a modular neural network architecture that jointly solves different tasks. As an example, we use the tasks of depth estimation and semantic segmentation given a single RGB image. The main focus of this work is to analyze the cross-modality influence between depth and semantic prediction maps on their joint refinement. While most previous works focus solely on measuring improvements in accuracy, we propose a way to quantify the cross-modality influence. We show that there is a relationship between final accuracy and cross-modality influence, although not a simple linear one. Hence, a larger cross-modality influence does not necessarily translate into improved accuracy. We find that a beneficial balance between the cross-modality influences can be achieved through network architecture, and we conjecture that this relationship can be used to understand different network design choices. Towards this end, we propose a Convolutional Neural Network (CNN) architecture that fuses the state-of-the-art results for depth estimation and semantic labeling. By balancing the cross-modality influences between depth and semantic prediction, we achieve improved results for both tasks on the NYU-Depth v2 benchmark. Comment: Accepted to ICRA 201

    Efficiency-speed tradeoff in slow-light silicon photonic modulators

    The purpose of this paper is twofold. First, we discuss the efficiency-speed tradeoff in slow-light (SL) silicon photonic (SiP) modulators. For this, a comprehensive model for the electro-optic (EO) response of lumped-electrode SL Mach-Zehnder modulators (SL-MZMs) is presented. The model's accuracy is verified by comparing it to experiments. Our analysis shows that slowing down the optical wave enhances efficiency by increasing the interaction time between the optical wave and the uniform voltage across the lumped electrodes, but at the cost of limiting the EO bandwidth. Then, we investigate SL-MZMs with traveling-wave (TW) electrodes, whose dynamic interaction is predicted using a distributed circuit model. Solved with the finite-difference time-domain (FDTD) method, the model shows that TW SL-MZMs are capable of improving both efficiency and speed under an optimized SL effect. We also compare SL-MZMs with conventional MZMs (C-MZMs) using a figure of merit (FOM) that combines key parameters such as efficiency, loss, and EO bandwidth. We show that the additional loss of SL waveguides significantly impacts the preferred modulator choice at different baud rates. The second aim of this paper is to examine different design strategies for reducing the Vπ of C-MZMs to meet the requirements of CMOS drivers, using 1) a longer phase shifter, 2) higher doping densities, and 3) the SL effect. It is shown that the SL effect provides the best overall performance among the three. Indeed, only the SL effect offers simultaneous improvement in Vπ, footprint, and EO bandwidth; the other approaches reduce Vπ but at the cost of reduced speed, enlarged footprint, or both.

    Mach-Zehnder silicon photonic modulator assisted by phase-shifted Bragg gratings

    We experimentally demonstrate a silicon photonic Mach-Zehnder modulator (MZM) assisted by phase-shifted Bragg gratings. Coupled resonators are inserted in the Bragg grating structure to significantly enhance the phase modulation efficiency, while maintaining a wide optical bandwidth compared to other resonator-based modulators. Fabricated using a CMOS-compatible foundry process, the device achieved a small-signal Vπ × L of 0.18 V·cm, which is seven times lower than a conventional silicon MZM fabricated with the same process. The device has a compact footprint, with a length of only 162 μm, and shows a modulation bandwidth of 28 GHz at a reverse bias of 1 V. Non-return-to-zero modulation is demonstrated at 30 Gb/s with a bit-error-rate (BER) below the 7%-overhead forward error correction (FEC) threshold over a bandwidth of 3.5 nm. This bandwidth should translate into an operating temperature range greater than 40 °C.
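    As a rough check on the figures quoted in this abstract, the arithmetic below assumes that the small-signal Vπ × L product applies directly over the stated 162 μm device length. That is an assumption made here for illustration; the abstract does not state Vπ itself or the conventional MZM's figure, so both values are only implied.

```latex
% Implied half-wave voltage, assuming L = 162 um = 0.0162 cm (assumption)
V_\pi \approx \frac{V_\pi L}{L}
      = \frac{0.18\ \mathrm{V\,cm}}{0.0162\ \mathrm{cm}}
      \approx 11.1\ \mathrm{V}

% Implied figure for the conventional MZM, from "seven times lower"
(V_\pi L)_{\mathrm{conventional}} \approx 7 \times 0.18\ \mathrm{V\,cm}
      = 1.26\ \mathrm{V\,cm}
```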

    Silicon photonic modulator using mode conversion with asymmetric sidewall Bragg gratings

    An asymmetric sidewall grating allows a Bragg modulator to operate in reflection without a circulator and with less than 1.5 dB of on-chip loss. An asymmetric Y-branch directs the incident TE0 mode to the grating, while the reflected TE1 mode is guided to the drop port.